Object Discovery via Cohesion Measurement
Color and intensity are two important components in an image. Usually, groups
of image pixels, which are similar in color or intensity, are an informative
representation for an object. They are therefore particularly suitable for
computer vision tasks, such as saliency detection and object proposal
generation. However, image pixels, which share a similar real-world color, may
be quite different since colors are often distorted by intensity. In this
paper, we reinvestigate the affinity matrices originally used in image
segmentation methods based on spectral clustering. A new affinity matrix, which
is robust to color distortions, is formulated for object discovery. Moreover, a
Cohesion Measurement (CM) for object regions is also derived based on the
formulated affinity matrix. Based on the new Cohesion Measurement, a novel
object discovery method is proposed to discover objects latent in an image by
utilizing the eigenvectors of the affinity matrix. Then we apply the proposed
method to both saliency detection and object proposal generation. Experimental
results on several evaluation benchmarks demonstrate that the proposed CM based
method has achieved promising performance for these two tasks.
Comment: 14 pages, 14 figures
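As a rough illustration of the spectral-clustering background this abstract builds on, the sketch below groups pixel intensities using the second eigenvector of a normalised affinity matrix. It is a minimal sketch of the classical technique only: the Gaussian intensity affinity, the `sigma` value, the toy intensities, and the function names are all assumptions made for illustration, and it does not reproduce the paper's distortion-robust affinity matrix or its Cohesion Measurement.

```python
import math

def gaussian_affinity(values, sigma=0.1):
    """Dense affinity W[i][j] = exp(-(v_i - v_j)^2 / (2 * sigma^2)). Assumed form."""
    return [[math.exp(-((a - b) ** 2) / (2 * sigma ** 2)) for b in values]
            for a in values]

def second_eigenvector(W, iters=100):
    """Second eigenvector of M = D^-1/2 W D^-1/2 via power iteration with deflation."""
    n = len(W)
    d = [sum(row) for row in W]
    M = [[W[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    # The leading eigenvector of M (eigenvalue 1) is sqrt(d), normalised.
    norm = math.sqrt(sum(d))
    u1 = [math.sqrt(x) / norm for x in d]
    v = [float(i + 1) for i in range(n)]  # arbitrary deterministic start vector
    for _ in range(iters):
        # Remove the component along u1 so the iteration converges to the next eigenvector.
        dot = sum(a * b for a, b in zip(v, u1))
        v = [a - dot * b for a, b in zip(v, u1)]
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    return v

# Toy "image": three dark pixels followed by three bright pixels.
intensities = [0.10, 0.12, 0.11, 0.80, 0.82, 0.79]
vector = second_eigenvector(gaussian_affinity(intensities))
labels = [x > 0 for x in vector]  # the sign of each entry splits the pixels into two groups
```

The sign pattern of `vector` separates the dark pixels from the bright ones; which group receives `True` depends on the arbitrary sign of the eigenvector.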
Disentangled Contrastive Image Translation for Nighttime Surveillance
Nighttime surveillance suffers from degradation due to poor illumination and
arduous human annotations. It is challenging and remains a security risk at
night. Existing methods rely on multi-spectral images to perceive objects in
the dark, which are troubled by low resolution and color absence. We argue that
the ultimate solution for nighttime surveillance is night-to-day translation,
or Night2Day, which aims to translate a surveillance scene from nighttime to
the daytime while maintaining semantic consistency. To achieve this, this paper
presents a Disentangled Contrastive (DiCo) learning method. Specifically, to
address the poor and complex illumination in the nighttime scenes, we propose a
learnable physical prior, i.e., the color invariant, which provides a stable
perception of a highly dynamic night environment and can be incorporated into
the learning pipeline of neural networks. Targeting the surveillance scenes, we
develop a disentangled representation, which is an auxiliary pretext task that
separates surveillance scenes into the foreground and background with
contrastive learning. Such a strategy can extract the semantics without
supervision and boost our model to achieve instance-aware translation. Finally,
we incorporate all the modules above into generative adversarial networks and
achieve high-fidelity translation. This paper also contributes a new
surveillance dataset called NightSuR. It includes six scenes to support the
study on nighttime surveillance. This dataset collects nighttime images with
different properties of nighttime environments, such as flare and extreme
darkness. Extensive experiments demonstrate that our method outperforms
existing works significantly. The dataset and source code will be released on
GitHub soon.
Comment: Submitted to TI
Low-Light Hyperspectral Image Enhancement
Due to inadequate energy captured by the hyperspectral camera sensor in poor
illumination conditions, low-light hyperspectral images (HSIs) usually suffer
from low visibility, spectral distortion, and various noises. A range of HSI
restoration methods have been developed, yet their effectiveness in enhancing
low-light HSIs is constrained. This work focuses on the low-light HSI
enhancement task, which aims to reveal the spatial-spectral information hidden
in darkened areas. To facilitate the development of low-light HSI processing,
we collect a low-light HSI (LHSI) dataset of both indoor and outdoor scenes.
Based on Laplacian pyramid decomposition and reconstruction, we develop an
end-to-end data-driven low-light HSI enhancement (HSIE) approach trained on the
LHSI dataset. With the observation that illumination is related to the
low-frequency component of HSI, while textural details are closely correlated
to the high-frequency component, the proposed HSIE is designed to have two
branches. The illumination enhancement branch is adopted to enlighten the
low-frequency component with reduced resolution. The high-frequency refinement
branch is utilized for refining the high-frequency component via a predicted
mask. In addition, to improve information flow and boost performance, we
introduce an effective channel attention block (CAB) with residual dense
connection, which serves as the basic block of the illumination enhancement
branch. The effectiveness and efficiency of HSIE both in quantitative
assessment measures and visual effects are demonstrated by experimental results
on the LHSI dataset. According to the classification performance on the remote
sensing Indian Pines dataset, downstream tasks benefit from the enhanced HSI.
Datasets and code are available at
https://github.com/guanguanboy/HSIE
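The Laplacian pyramid decomposition and reconstruction mentioned in this abstract can be sketched on a 1D signal. This is a simplified illustration under stated assumptions: it uses pair averaging and sample repetition in place of the Gaussian filtering normally used, the function names are hypothetical, and it is not the HSIE network. It only shows how a signal splits into a low-frequency residual plus high-frequency bands that reconstruct it exactly.

```python
def downsample(signal):
    """Halve the resolution by averaging adjacent pairs (assumes even length)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

def upsample(signal):
    """Double the resolution by repeating each sample."""
    return [x for x in signal for _ in range(2)]

def decompose(signal, levels):
    """Return (high-frequency bands, finest first; low-frequency residual)."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        low = downsample(current)
        # The band stores what is lost when going to the coarser level.
        bands.append([c - u for c, u in zip(current, upsample(low))])
        current = low
    return bands, current

def reconstruct(bands, low):
    """Invert decompose(): upsample and add the bands back, coarsest first."""
    current = list(low)
    for band in reversed(bands):
        current = [u + b for u, b in zip(upsample(current), band)]
    return current

signal = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 0.0, 4.0]
bands, low = decompose(signal, levels=2)
restored = reconstruct(bands, low)
```

In this toy setting, `low` plays the role of the reduced-resolution low-frequency component, and `bands` hold the detail that a refinement stage would operate on.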
Propagate And Calibrate: Real-time Passive Non-line-of-sight Tracking
Non-line-of-sight (NLOS) tracking has drawn increasing attention in recent
years, due to its ability to detect object motion out of sight. Most previous
works on NLOS tracking rely on active illumination, e.g., laser, and suffer
from high cost and elaborate experimental conditions. Besides, these techniques
are still far from practical application due to oversimplified settings. In
contrast, we propose a purely passive method to track a person walking in an
invisible room by only observing a relay wall, which is more in line with real
application scenarios, e.g., security. To excavate imperceptible changes in
videos of the relay wall, we introduce difference frames as an essential
carrier of temporal-local motion messages. In addition, we propose PAC-Net,
which consists of alternating propagation and calibration, making it capable of
leveraging both dynamic and static messages on a frame-level granularity. To
evaluate the proposed method, we build and publish the first dynamic passive
NLOS tracking dataset, NLOS-Track, which fills the vacuum of realistic NLOS
datasets. NLOS-Track contains thousands of NLOS video clips and corresponding
trajectories. Both real-shot and synthetic data are included. Our code and
dataset are available at https://againstentropy.github.io/NLOS-Track/.
Comment: CVPR 2023 camera-ready version.
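The difference frames mentioned in this abstract can be illustrated with a minimal sketch, assuming each frame is a small grayscale image stored as a list of rows. The function name and the toy pixel values are hypothetical; this is not PAC-Net or the NLOS-Track preprocessing, only the basic idea of subtracting consecutive frames so that small temporal changes stand out against a static background.

```python
def difference_frames(frames):
    """Return per-pixel differences between each frame and the one before it."""
    diffs = []
    for prev, curr in zip(frames, frames[1:]):
        diffs.append([[c - p for p, c in zip(prev_row, curr_row)]
                      for prev_row, curr_row in zip(prev, curr)])
    return diffs

# Three 2x2 frames of a nearly static wall with one-level intensity changes.
frames = [
    [[100, 100], [100, 100]],
    [[100, 101], [100, 100]],
    [[100, 101], [99, 100]],
]
diffs = difference_frames(frames)
```

A sequence of N frames yields N - 1 difference frames, and pixels that do not change between frames map to zero.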
Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction
This paper addresses the problem of rolling shutter correction in complex
nonlinear and dynamic scenes with extreme occlusion. Existing methods suffer
from two main drawbacks. Firstly, they face challenges in estimating the
accurate correction field due to the uniform velocity assumption, leading to
significant image correction errors under complex motion. Secondly, the drastic
occlusion in dynamic scenes prevents current solutions from achieving better
image quality because of the inherent difficulties in aligning and aggregating
multiple frames. To tackle these challenges, we model the curvilinear
trajectory of pixels analytically and propose a geometry-based Quadratic
Rolling Shutter (QRS) motion solver, which precisely estimates the high-order
correction field of individual pixels. Besides, to reconstruct high-quality
occlusion frames in dynamic scenes, we present a 3D video architecture that
effectively Aligns and Aggregates multi-frame context, namely, RSA2-Net. We
evaluate our method across a broad range of cameras and video sequences,
demonstrating its significant superiority. Specifically, our method surpasses
the state-of-the-art by +4.98, +0.77, and +4.33 of PSNR on Carla-RS, Fastec-RS,
and BS-RSC datasets, respectively. Code is available at
https://github.com/DelinQu/qrsc.
Comment: accepted at ICCV 202
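To make the contrast between the uniform velocity assumption and a quadratic motion model concrete, the sketch below compares per-row displacements under both. It is an illustration of basic kinematics under assumed values: the row timing model, the normalised units, and the function names are hypothetical, and it is not the QRS motion solver described in the abstract.

```python
def row_time(row, ref_row, line_delay):
    """Capture time of a row relative to the reference row (assumed linear readout)."""
    return (row - ref_row) * line_delay

def displacement(t, velocity, acceleration=0.0):
    """Displacement after time t: v*t under uniform velocity, plus a*t^2/2 if accelerating."""
    return velocity * t + 0.5 * acceleration * t * t

rows = range(5)
times = [row_time(r, ref_row=2, line_delay=1.0) for r in rows]

# Uniform-velocity model: displacement grows linearly with row time.
linear = [displacement(t, velocity=2.0) for t in times]
# Quadratic model: the same velocity plus a constant acceleration term.
quadratic = [displacement(t, velocity=2.0, acceleration=1.0) for t in times]
# The gap between the two models grows with distance from the reference row.
model_gap = [q - l for q, l in zip(quadratic, linear)]
```

The gap is zero at the reference row and grows quadratically away from it, which is one way to see why a linear correction field can leave residual error under accelerating motion.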